adversary regime
The Curious Case of Adversarially Robust Models: More Data Can Help, Double Descend, or Hurt Generalization
Min, Yifei, Chen, Lin, Karbasi, Amin
Despite remarkable success, deep neural networks are sensitive to human-imperceptible small perturbations on the data and could be adversarially misled to produce incorrect or even dangerous predictions. To circumvent these issues, practitioners introduced adversarial training to produce adversarially robust models whose predictions are robust to small perturbations to the data. It is widely believed that more training data will help adversarially robust models generalize better on the test data. In this paper, however, we challenge this conventional belief and show that more training data could hurt the generalization of adversarially robust models for the linear classification problem. We identify three regimes based on the strength of the adversary. In the weak adversary regime, more data improves the generalization of adversarially robust models. In the medium adversary regime, with more training data, the generalization loss exhibits a double descent curve. This implies that in this regime, there is an intermediate stage where more training data hurts their generalization. In the strong adversary regime, more data almost immediately causes the generalization error to increase.
More Data Can Expand the Generalization Gap Between Adversarially Robust and Standard Models
Chen, Lin, Min, Yifei, Zhang, Mingrui, Karbasi, Amin
As modern machine learning models continue to gain traction in the real world, a wide variety of novel problems have come to the forefront of the research community. One particularly important challenge has been that of adversarial attacks (Szegedy et al., 2013; Goodfellow et al., 2014; Kos et al., 2018; Carlini & Wagner, 2018). To be specific, given a model with excellent performance on a standard data set, one can add small perturbations to the test data that can fool the model and cause it to make wrong predictions. What is more worrying is that these small perturbations can possibly be designed to be imperceptible to human beings, which raises concerns about potential safety issues and risks, especially when it comes to applications such as autonomous vehicles where human lives are at stake. The problem of adversarial robustness in machine learning models has been explored from several different perspectives since its discovery. One direction has been to propose attacks that challenge these models and their training procedures (Carlini & Wagner, 2017; Gu & Rigazio, 2014; Athalye et al., 2018; Papernot et al., 2016a; Moosavi-Dezfooli et al., 2016).